NPS-53-85-0008 



NAVAL POSTGRADUATE SCHOOL



Monterey, California 



LAPLACIAN SMOOTHING SPLINES WITH 
GENERALIZED CROSS VALIDATION FOR 
OBJECTIVE ANALYSIS OF METEOROLOGICAL DATA 



Richard Franke 


August 1985 

Technical Report For Period 
October 1984 - March 1985 



Approved for public release; distribution unlimited 
Prepared for:

Naval Environmental Prediction Research Facility
Monterey, CA 93943







NAVAL POSTGRADUATE SCHOOL 
MONTEREY CALIFORNIA 93943 



R. H. Shumaker 

Rear Admiral, U. S. Navy 

Superintendent 



D. A. Schrady
Provost



This work was funded by the Naval Environmental Prediction
Research Facility, Monterey, CA under Program Element 61153N,
Project (none), "Interpolation of Scattered Meteorological Data".



Reproduction of all or part of this report is authorized. 



UNCLASSIFIED 



SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)



REPORT DOCUMENTATION PAGE 


READ INSTRUCTIONS 
BEFORE COMPLETING FORM 


1. REPORT NUMBER

NPS-53-85-0008

2. GOVT ACCESSION NO.

3. RECIPIENT'S CATALOG NUMBER


4. TITLE (and Subtitle)

Laplacian Smoothing Splines with Generalized 
Cross Validation for Objective Analysis of 
Meteorological Data 


5. TYPE OF REPORT & PERIOD COVERED 

Technical Report 
Interim, FY 1985 


6. PERFORMING ORG. REPORT NUMBER 


7. AUTHOR(s)

Richard Franke


8. CONTRACT OR GRANT NUMBER(s)


9. PERFORMING ORGANIZATION NAME AND ADDRESS

Naval Postgraduate School
Monterey, CA 93943


10. PROGRAM ELEMENT, PROJECT, TASK
AREA & WORK UNIT NUMBERS

Program Element 61153N


11. CONTROLLING OFFICE NAME AND ADDRESS

Naval Environmental Prediction Research Facility 
Monterey, CA 93943 


12. REPORT DATE 

August 1985 


13. NUMBER OF PAGES 

29 


14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)

Naval Air Systems Command
Washington, DC 20361


15. SECURITY CLASS. (of this report)

Unclassified


15a. DECLASSIFICATION/DOWNGRADING
SCHEDULE


16. DISTRIBUTION STATEMENT (of this Report)

Approved for public release; distribution unlimited



17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)



18. SUPPLEMENTARY NOTES



19. KEY WORDS (Continue on reverse side if necessary and identify by block number)

Objective analysis
Laplacian smoothing splines
Optimum interpolation
Generalized cross validation

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

The use of Laplacian smoothing splines (LSS) with generalized cross
validation (GCV) to choose the smoothing parameter for the objective
analysis problem is investigated. Simulated 500 mb pressure height
fields are approximated from first-guess data with spatially
correlated errors and observed values having independent errors. It
is found that GCV does not allow LSS to adapt to variations in
individual realizations, and that specification of a single suitable
smoothing parameter value for all realizations leads to smaller rms
error overall. While the tests were performed in the context of data
from a meteorology problem, it is expected that the results carry
over to data from other sources. A comparison shows that
significantly better approximations can be obtained using LSS applied
in a unified manner to both first-guess and observed values rather
than in a correction to first-guess scheme (as in Optimum
Interpolation) when the first-guess error has low spatial
correlation.





1. Introduction



In numerical weather prediction, objective analysis is the 
process of combining information obtained from observations of 
meteorological variables with that from the numerical prediction 
process. The resulting "analyzed" values are used to prepare 
weather maps, as well as to initialize the variables for the next 
weather prediction cycle. The problem is inherently a multivariate
one since the variables are not independent, e.g., pressure
heights are related to winds. The predicted values are on a
regular grid, and have errors which are spatially correlated.
The observed values are measured imperfectly, and occur at
irregularly spaced (scattered) points (both in space and time).
In some cases the observation errors are independent with zero
mean; in others, such as satellite observations, they are biased
and correlated.

The traditional approach to the problem is a two-step
process. The predicted values are treated as a first guess and
interpolated from the grid to the observation points. The
difference between the first-guess values interpolated to the
observation points and the observed values, called the first-guess
error, is then interpolated back to the grid points as a
correction to the first-guess values. The interpolation from
grid to observation points is the "easy" process, and has not
received much attention in the literature. The procedure generally
used is multilinear interpolation (e.g., Bergman, 1979, or Lorenc,
1981), although recent investigations by the author (Franke,
1985) have demonstrated that appreciable error may occur in this
step. The interpolation from observation to grid points is the
"hard" problem and has received widespread attention.
Historically the favored scheme has been a weighted average scheme,
originally introduced by Cressman (1959), with a variation due to
Barnes (1973). Currently the method of choice is a statistical
scheme known in the meteorological literature as Optimum
Interpolation (OI), and in other disciplines by other names (e.g.,
Kriging in the mining and geology literature).
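
As an illustration of the two-step correction procedure, the
following is a minimal sketch in Python (NumPy); it is not the code
used in this study. It interpolates the first guess bilinearly to the
observation points, forms the first-guess errors there, and spreads
them back to the grid with a single pass of Cressman-type weights
w = (R^2 - d^2)/(R^2 + d^2) for d < R. The function names, the array
layout (field[j, i] with j indexing y), and the influence radius R
are assumptions made for the example.

```python
import numpy as np

def bilinear(grid_x, grid_y, field, px, py):
    """Bilinear interpolation of field[j, i] (j along y, i along x) to points."""
    # Index of the grid cell containing each point, clipped to the grid.
    i = np.clip(np.searchsorted(grid_x, px) - 1, 0, len(grid_x) - 2)
    j = np.clip(np.searchsorted(grid_y, py) - 1, 0, len(grid_y) - 2)
    tx = (px - grid_x[i]) / (grid_x[i + 1] - grid_x[i])
    ty = (py - grid_y[j]) / (grid_y[j + 1] - grid_y[j])
    return ((1 - tx) * (1 - ty) * field[j, i]
            + tx * (1 - ty) * field[j, i + 1]
            + (1 - tx) * ty * field[j + 1, i]
            + tx * ty * field[j + 1, i + 1])

def cressman_correction(grid_x, grid_y, first_guess, ox, oy, obs, radius):
    """Two-step analysis: grid -> observations, then increments -> grid."""
    # Step 1: first guess at the observation points and the increments there.
    increment = obs - bilinear(grid_x, grid_y, first_guess, ox, oy)
    # Step 2: weighted average of the increments at every grid point.
    gx, gy = np.meshgrid(grid_x, grid_y)
    d2 = (gx[..., None] - ox) ** 2 + (gy[..., None] - oy) ** 2
    r2 = radius ** 2
    w = np.where(d2 < r2, (r2 - d2) / (r2 + d2), 0.0)
    wsum = w.sum(axis=-1)
    # Grid points with no observation inside the radius get no correction.
    correction = np.divide((w * increment).sum(axis=-1), wsum,
                           out=np.zeros_like(wsum), where=wsum > 0)
    return first_guess + correction
```

Grid points with no observation within the radius keep their
first-guess value, which is one simple convention among several.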

The interpolation process known as OI has its roots in the
work of Wiener and Kolmogorov, and was introduced to the
meteorological literature by Gandin (1963). The theory of the process
depends on it being applied to a random function with known
spatial statistics. In particular it is assumed that the spatial
covariance structure of the class of functions to which it is
applied is known. In addition it is necessary to know the error
statistics of the observation devices. If this is the case, then
the process yields the best answer possible in the sense that the
variance of the error is minimized over all functions in the
class. For meteorological purposes, this means the covariance
structure of an ensemble of realizations must be known, and then
the mean squared error over the entire ensemble is minimized.
Using standard least squares methods, the variance of the
expected error is easily computed, and much emphasis has been put on
this as an advantage of the method.
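
To make the role of the covariance structure concrete, the following
is a minimal univariate sketch in Python (NumPy) of the OI correction
and its expected error variance. It assumes, purely for illustration,
a homogeneous Gaussian first-guess error covariance with standard
deviation sigma_fg and length scale `length`, and uncorrelated
observation errors with standard deviation sigma_obs; none of these
choices or names is taken from the report.

```python
import numpy as np

def gaussian_cov(p, q, sigma, length):
    """Assumed covariance model: sigma^2 * exp(-d^2 / (2 * length^2))."""
    d2 = ((p[:, None, :] - q[None, :, :]) ** 2).sum(axis=-1)
    return sigma ** 2 * np.exp(-d2 / (2.0 * length ** 2))

def optimum_interpolation(grid_pts, obs_pts, increment,
                          sigma_fg, length, sigma_obs):
    """Return the grid correction and its expected error variance.

    increment holds observed minus first-guess values at obs_pts.
    """
    c_oo = gaussian_cov(obs_pts, obs_pts, sigma_fg, length)
    c_go = gaussian_cov(grid_pts, obs_pts, sigma_fg, length)
    # Observation-space covariance: first-guess part plus observation error.
    m = c_oo + sigma_obs ** 2 * np.eye(len(obs_pts))
    # Row g holds the weights for grid point g: W = C_go M^{-1}.
    weights = np.linalg.solve(m, c_go.T).T
    correction = weights @ increment
    # Expected analysis error variance: sigma_fg^2 - diag(W C_og).
    error_variance = sigma_fg ** 2 - np.einsum("ij,ij->i", weights, c_go)
    return correction, error_variance
```

The second return value is the quantity referred to above as the
variance of the expected error; it depends only on the assumed
statistics and the observation locations, not on the observed values.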

There have been numerous papers about the multivariate
application of OI to the objective analysis problem. These are of
an applications nature, and it is difficult to separate the
behavior of such schemes from that of the other processes
involved. In studies of objective analysis using simulated data
to attempt to learn something about the properties of the scheme,
many simplifications are required. This study is no different.
The univariate (only one meteorological variable is treated, in
this case the 500 mb pressure height surface) application of OI
and other schemes is investigated. Because the generation of
simulated data with specified spatial correlation properties
requires the factorization of the correlation matrix for the
first-guess error at the grid points, it is necessary to work
with a relatively small grid. Further, the problem of
non-synoptic observation of variables is not treated; rather, all
observations are assumed to be made at the same time, the time at
which the particular realization occurs. Within the prescribed
limitations, the procedure used is valid and yields information about
the objective analysis process which should prove to be useful in
practice.
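
The factorization requirement can be illustrated with a short sketch
in Python (NumPy). It assumes a Cholesky factorization and a Gaussian
correlation function; the report does not say which factorization or
correlation model was used, so both are assumptions, as are the
function name and the small diagonal jitter added for numerical
safety. The cost of the factorization grows as the cube of the number
of grid points, which is why the grid must be kept small.

```python
import numpy as np

def correlated_errors(points, sigma, length, n_realizations, rng):
    """Draw zero-mean normal errors with a specified spatial covariance.

    points has shape (n, 2); the result has shape (n_realizations, n).
    """
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    # Assumed covariance model (illustrative only).
    cov = sigma ** 2 * np.exp(-d2 / (2.0 * length ** 2))
    # Tiny jitter keeps the factorization stable when cov is near singular.
    cov = cov + 1e-8 * sigma ** 2 * np.eye(len(points))
    chol = np.linalg.cholesky(cov)              # cov = chol @ chol.T
    white = rng.standard_normal((len(points), n_realizations))
    return (chol @ white).T, cov                # each row has covariance cov
```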

A somewhat different way of looking at the problem was
proposed by Wahba and Wendelberger (1980). See also Wendelberger
(1981). In their work, no first guess was necessary or assumed;
all data were considered to be observation values. Thus the
underlying field to be approximated was treated directly, rather
than making a correction to the first-guess field. The overall
process involved the use of Laplacian smoothing splines and
generalized cross validation to determine a suitable value for
the smoothing parameter. If a first guess is available, with
known correlated errors, then ignoring this information is
probably unwise. The first guess can be used in the traditional
manner, with the Laplacian smoothing splines applied to the
first-guess error. It is also possible to apply the Laplacian
smoothing splines to all of the data. Thus, part of the
investigation reported here involved the use of Laplacian smoothing
splines and generalized cross validation for the smoothing
parameter in a scheme that approximates the underlying field
directly, but that also makes use of all available data in a way that
accounts for the correlation of the errors. The program used was
a modified version of the program MSSP, available from the
Madison Academic Computing Center, University of Wisconsin.

Section 2 gives an outline of the goals of this study, 
background information about the methods of objective analysis 
considered, and aspects of the schemes investigated. The results 
of the study are given and discussed in Section 3. Finally, the 
implications of the results and conclusions about approaches to 
objective analysis, and suggestions for further study are given 
in Section 4. 

2. Goals of the study 

This study had two principal goals: (1) To investigate the 
efficacy of generalized cross validation (GCV) in determining the 
smoothing parameter used in Laplacian smoothing splines (LSS), 
and (2) To test the possibility of treating first-guess values 
and observed values in a unified method with LSS. The smoothing 
parameter value must be given in order to use LSS, and Wahba and 
Wendelberger (1980) have indicated that GCV might be a good way
to choose the value. In this study I performed simulations to 
determine if GCV could adapt properly to particular realizations 
in an ensemble with specified error statistics. 
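
For concreteness, the following sketch in Python (NumPy) shows how
GCV selects a smoothing parameter. It takes the Laplacian smoothing
spline to be the two-dimensional thin plate spline with a
second-derivative penalty, which is an assumption; the order used in
the study is not stated here, and this is not the MSSP program. With
influence matrix A(lam) mapping data to fitted values, the GCV
function is V(lam) = (1/n)|(I - A)z|^2 / [(1/n) tr(I - A)]^2. The
kernel r^2 log r omits the usual constant 1/(8 pi), which only
rescales lam, and the candidate list is a hypothetical search grid.

```python
import numpy as np

def tps_kernel(p, q):
    """Thin plate kernel r^2 log r, taken as 0 at r = 0."""
    r2 = ((p[:, None, :] - q[None, :, :]) ** 2).sum(axis=-1)
    return 0.5 * r2 * np.log(np.where(r2 > 0, r2, 1.0))

def influence_matrix(pts, lam):
    """A(lam) such that the fitted values are A(lam) @ z."""
    n = len(pts)
    k = tps_kernel(pts, pts)
    t = np.column_stack([np.ones(n), pts])      # unpenalized part: 1, x, y
    q, _ = np.linalg.qr(t, mode="complete")
    q2 = q[:, 3:]                               # basis orthogonal to 1, x, y
    b = q2.T @ k @ q2 + n * lam * np.eye(n - 3)
    # I - A = n * lam * Q2 (Q2' K Q2 + n * lam * I)^{-1} Q2'
    return np.eye(n) - n * lam * q2 @ np.linalg.solve(b, q2.T)

def gcv_score(pts, z, lam):
    """Generalized cross validation function V(lam)."""
    n = len(pts)
    a = influence_matrix(pts, lam)
    resid = z - a @ z
    return (resid @ resid / n) / (1.0 - np.trace(a) / n) ** 2

def choose_lambda(pts, z, candidates):
    """Return the candidate with the smallest GCV score, and all scores."""
    scores = [gcv_score(pts, z, lam) for lam in candidates]
    return candidates[int(np.argmin(scores))], scores
```

As lam grows, the fit tends to the least squares plane (trace of A
tends to 3); as lam tends to zero, the spline interpolates the data
(trace of A tends to n). GCV picks a value between these extremes
separately for each data set, which is the adaptive behavior the
simulations were designed to test.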






The advantage of a unified scheme for both first-guess and 
observed values is that it potentially makes it possible to 
obtain better analyses where the observations are sparse compared 
to the grid or correlation distances. The LSS method used in 
this investigation was the scheme proposed by Wahba and 
Wendelberger (1980), which is described more fully in 
Wendelberger (1981, 1982). The general framework of this study
follows that of a previous investigation (Franke, 1985).

A brief description of the setting in which the numerical
experiments were performed follows. An underlying function to be
approximated was chosen. The simulated pressure height field
described by Koehler (1979) was used, at the 500 mb level, with
random values for two parameters, $\theta_0$ (chosen uniformly
distributed on [-112.5°, -82.5°]) and $\Delta\theta$ (chosen
uniformly distributed on [-15°, 15°]). One possible realization of
the field is shown in Figure 4. The underlying field was then
evaluated on a rectangular grid. Normally distributed first-guess
errors with specified spatial covariance were generated and added
to the field values to obtain the first-guess values. Then, the
underlying field was evaluated at a set of observation points, and
normally distributed independent observation errors with
specified variance were added to these values to obtain observation
values. An objective analysis scheme was then applied using the
first-guess and observation values to obtain estimates of the
underlying field at the grid points; these are called the
analyzed values of the field. The errors in the analyzed values
were then computed. After repeating the process for many
realizations, estimates of the root-mean-square error were obtained.
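
The experimental procedure can be summarized by the following sketch
in Python (NumPy). It is a schematic of the steps just listed, not
the code used for the experiments. The field generator below is a
hypothetical stand-in, not the Koehler (1979) field, and the names
`analyze`, `sample_field`, and `fg_cov` are assumptions: `analyze` is
any objective analysis scheme mapping first-guess and observation
values to analyzed grid values.

```python
import numpy as np

def rms_analysis_error(analyze, sample_field, grid_pts, obs_pts,
                       fg_cov, sigma_obs, n_realizations, seed=0):
    """Estimate the rms error of an analysis scheme over many realizations."""
    rng = np.random.default_rng(seed)
    n_grid, n_obs = len(grid_pts), len(obs_pts)
    total_sq = 0.0
    for _ in range(n_realizations):
        field = sample_field(rng)               # one underlying field
        truth_grid = field(grid_pts)
        truth_obs = field(obs_pts)
        # First guess: truth plus spatially correlated normal errors.
        first_guess = truth_grid + rng.multivariate_normal(
            np.zeros(n_grid), fg_cov)
        # Observations: truth plus independent normal errors.
        obs = truth_obs + sigma_obs * rng.standard_normal(n_obs)
        analysis = analyze(first_guess, obs)
        total_sq += np.mean((analysis - truth_grid) ** 2)
    return np.sqrt(total_sq / n_realizations)

def sample_placeholder_field(rng):
    """Hypothetical field with two random parameters (illustration only)."""
    phase = rng.uniform(-0.5, 0.5)
    tilt = rng.uniform(-1.0, 1.0)
    return lambda p: 5500.0 + 100.0 * np.sin(p[:, 0] + phase) + tilt * p[:, 1]
```

Passing an `analyze` function that returns the first guess unchanged
gives the rms first-guess error, a natural baseline against which any
analysis scheme can be compared.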






